Goto

Collaborating Authors

 Guelph




Can synthetic data reproduce real-world findings in epidemiology? A replication study using tree-based generative AI

arXiv.org Machine Learning

Generative artificial intelligence for synthetic data generation holds substantial potential to address practical challenges in epidemiology. However, many current methods suffer from limited quality, high computational demands, and complexity for non-experts. Furthermore, common evaluation strategies for synthetic data often fail to directly reflect statistical utility. Against this background, a critical underexplored question is whether synthetic data can reliably reproduce key findings from epidemiological research. We propose the use of adversarial random forests (ARF) as an efficient and convenient method for synthesizing tabular epidemiological data. To evaluate its performance, we replicated statistical analyses from six epidemiological publications and compared original with synthetic results. These publications cover blood pressure, anthropometry, myocardial infarction, accelerometry, loneliness, and diabetes, based on data from the German National Cohort (NAKO Gesundheitsstudie), the Bremen STEMI Registry U45 Study, and the Guelph Family Health Study. Additionally, we assessed the impact of dimensionality and variable complexity on synthesis quality by limiting datasets to variables relevant for individual analyses, including necessary derivations. Across all replicated original studies, results from multiple synthetic data replications consistently aligned with original findings. Even for datasets with relatively low sample size-to-dimensionality ratios, the replication outcomes closely matched the original results across various descriptive and inferential analyses. Reducing dimensionality and pre-deriving variables further enhanced both quality and stability of the results.


Electric Bus Charging Schedules Relying on Real Data-Driven Targets Based on Hierarchical Deep Reinforcement Learning

arXiv.org Artificial Intelligence

A Markov Decision Process (MDP) is conceived, where the time horizon includes multiple charging and operating periods in a day, while each period is further divided into multiple time steps. To overcome the challenge of long-range multi-phase planning with sparse reward, we conceive Hierarchical DRL (HDRL) for decoupling the original MDP into a high-level Semi-MDP (SMDP) and multiple low-level MDPs. The Hierarchical Double Deep Q-Network (HDDQN)-Hindsight Experience Replay (HER) algorithm is proposed for simultaneously solving the decision problems arising at different temporal resolutions. As a result, the high-level agent learns an effective policy for prescribing the charging targets for every charging period, while the low-level agent learns an optimal policy for setting the charging power of every time step within a single charging period, with the aim of minimizing the charging costs while meeting the charging target. It is proved that the flat policy constructed by superimposing the optimal high-level policy and the optimal low-level policy performs as well as the optimal policy of the original MDP . Since jointly learning both levels of policies is challenging due to the non-stationarity of the high-level agent and the sampling inefficiency of the low-level agent, we divide the joint learning process into two phases and exploit our new HER algorithm to manipulate the experience replay buffers for both levels of agents. Numerical experiments are performed with the aid of real-world data to evaluate the performance of the proposed algorithm. Index T erms Deep Reinforcement Learning; Electric Bus; Hierarchical Reinforcement learning; Charging Control A CRONYMS DDQN Double Deep Q Network DNN Deep Neural Network DQL Double Q-learning DQN Deep Q Network DRL Deep Reinforcement Learning EB Electric Bus EBCSP Electric Bus Charging Scheduling Problem GA Genetic Algorithm GHG GreenHouse Gas HAC Hierarchical Actor-Critic HDDQN Hierarchical Double Deep Q Network HDRL Hierarchical Deep Reinforcement Learning HER Hindsight Experience Replay HRL Hierarchical Reinforcement Learning IA Immune Algorithm MDP Markov Decision Process MILP Mixed Integer Linear Programming ML Machine Learning MRP Markov Reward Process RES Renewable Energy Source RL Reinforcement Learning RO Robust Optimization RTEM Real Time Energy Market 1 J. Qi and L. Lei are with the School of Engineering, University of Guelph, Guelph, ON N1G 2W1, Canada, jiaju@uoguelph.ca;


A Kolmogorov-Arnold Network for Explainable Detection of Cyberattacks on EV Chargers

arXiv.org Artificial Intelligence

The increasing adoption of Electric Vehicles (EVs) and the expansion of charging infrastructure and their reliance on communication expose Electric Vehicle Supply Equipment (EVSE) to cyberattacks. This paper presents a novel Kolmogorov-Arnold Network (KAN)-based framework for detecting cyberattacks on EV chargers using only power consumption measurements. Leveraging the KAN's capability to model nonlinear, high-dimensional functions and its inherently interpretable architecture, the framework effectively differentiates between normal and malicious charging scenarios. The model is trained offline on a comprehensive dataset containing over 100,000 cyberattack cases generated through an experimental setup. Once trained, the KAN model can be deployed within individual chargers for real-time detection of abnormal charging behaviors indicative of cyberattacks. Our results demonstrate that the proposed KAN-based approach can accurately detect cyberattacks on EV chargers with Precision and F1-score of 99% and 92%, respectively, outperforming existing detection methods. Additionally, the proposed KANs's enable the extraction of mathematical formulas representing KAN's detection decisions, addressing interpretability, a key challenge in deep learning-based cybersecurity frameworks. This work marks a significant step toward building secure and explainable EV charging infrastructure.


Quantifying Security Vulnerabilities: A Metric-Driven Security Analysis of Gaps in Current AI Standards

arXiv.org Artificial Intelligence

As AI systems integrate into critical infrastructure, security gaps in AI compliance frameworks demand urgent attention. This paper audits and quantifies security risks in three major AI governance standards: NIST AI RMF 1.0, UK's AI and Data Protection Risk Toolkit, and the EU's ALTAI. Using a novel risk assessment methodology, we develop four key metrics: Risk Severity Index (RSI), Attack Potential Index (AVPI), Compliance-Security Gap Percentage (CSGP), and Root Cause Vulnerability Score (RCVS). Our analysis identifies 136 concerns across the frameworks, exposing significant gaps. NIST fails to address 69.23 percent of identified risks, ALTAI has the highest attack vector vulnerability (AVPI = 0.51) and the ICO Toolkit has the largest compliance-security gap, with 80.00 percent of high-risk concerns remaining unresolved. Root cause analysis highlights under-defined processes (ALTAI RCVS = 033) and weak implementation guidance (NIST and ICO RCVS = 0.25) as critical weaknesses. These findings emphasize the need for stronger, enforceable security controls in AI compliance. We offer targeted recommendations to enhance security posture and bridge the gap between compliance and real-world AI risks.


Closing the Responsibility Gap in AI-based Network Management: An Intelligent Audit System Approach

arXiv.org Artificial Intelligence

Existing network paradigms have achieved lower downtime as well as a higher Quality of Experience (QoE) through the use of Artificial Intelligence (AI)-based network management tools. These AI management systems, allow for automatic responses to changes in network conditions, lowering operation costs for operators, and improving overall performance. While adopting AI-based management tools enhance the overall network performance, it also introduce challenges such as removing human supervision, privacy violations, algorithmic bias, and model inaccuracies. Furthermore, AI-based agents that fail to address these challenges should be culpable themselves rather than the network as a whole. To address this accountability gap, a framework consisting of a Deep Reinforcement Learning (DRL) model and a Machine Learning (ML) model is proposed to identify and assign numerical values of responsibility to the AI-based management agents involved in any decision-making regarding the network conditions, which eventually affects the end-user. A simulation environment was created for the framework to be trained using simulated network operation parameters. The DRL model had a 96% accuracy during testing for identifying the AI-based management agents, while the ML model using gradient descent learned the network conditions at an 83% accuracy during testing.


Towards an Ontology of Traceable Impact Management in the Food Supply Chain

arXiv.org Artificial Intelligence

The pursuit of quality improvements and accountability in the food supply chains, especially how they relate to food-related outcomes, such as hunger, has become increasingly vital, necessitating a comprehensive approach that encompasses product quality and its impact on various stakeholders and their communities. Such an approach offers numerous benefits in increasing product quality and eliminating superfluous measurements while appraising and alleviating the broader societal and environmental repercussions. A traceable impact management model (TIMM) provides an impact structure and a reporting mechanism that identifies each stakeholder's role in the total impact of food production and consumption stages. The model aims to increase traceability's utility in understanding the impact of changes on communities affected by food production and consumption, aligning with current and future government requirements, and addressing the needs of communities and consumers. This holistic approach is further supported by an ontological model that forms the logical foundation and a unified terminology. By proposing a holistic and integrated solution across multiple stakeholders, the model emphasizes quality and the extensive impact of championing accountability, sustainability, and responsible practices with global traceability. With these combined efforts, the food supply chain moves toward a global tracking and tracing process that not only ensures product quality but also addresses its impact on a broader scale, fostering accountability, sustainability, and responsible food production and consumption.


AI-Driven Cyber Threat Intelligence Automation

arXiv.org Artificial Intelligence

This study introduces an innovative approach to automating Cyber Threat Intelligence (CTI) processes in industrial environments by leveraging Microsoft's AI-powered security technologies. Historically, CTI has heavily relied on manual methods for collecting, analyzing, and interpreting data from various sources such as threat feeds. This study introduces an innovative approach to automating CTI processes in industrial environments by leveraging Microsoft's AI-powered security technologies. Historically, CTI has heavily relied on manual methods for collecting, analyzing, and interpreting data from various sources such as threat feeds, security logs, and dark web forums -- a process prone to inefficiencies, especially when rapid information dissemination is critical. By employing the capabilities of GPT-4o and advanced one-shot fine-tuning techniques for large language models, our research delivers a novel CTI automation solution. The outcome of the proposed architecture is a reduction in manual effort while maintaining precision in generating final CTI reports. This research highlights the transformative potential of AI-driven technologies to enhance both the speed and accuracy of CTI and reduce expert demands, offering a vital advantage in today's dynamic threat landscape.


Systematic Literature Review of Vision-Based Approaches to Outdoor Livestock Monitoring with Lessons from Wildlife Studies

arXiv.org Artificial Intelligence

Precision livestock farming (PLF) aims to improve the health and welfare of livestock animals and farming outcomes through the use of advanced technologies. Computer vision, combined with recent advances in machine learning and deep learning artificial intelligence approaches, offers a possible solution to the PLF ideal of 24/7 livestock monitoring that helps facilitate early detection of animal health and welfare issues. However, a significant number of livestock species are raised in large outdoor habitats that pose technological challenges for computer vision approaches. This review provides a comprehensive overview of computer vision methods and open challenges in outdoor animal monitoring. We include research from both the livestock and wildlife fields in the review because of the similarities in appearance, behaviour, and habitat for many livestock and wildlife. We focus on large terrestrial mammals, such as cattle, horses, deer, goats, sheep, koalas, giraffes, and elephants. We use an image processing pipeline to frame our discussion and highlight the current capabilities and open technical challenges at each stage of the pipeline. The review found a clear trend towards the use of deep learning approaches for animal detection, counting, and multi-species classification. We discuss in detail the applicability of current vision-based methods to PLF contexts and promising directions for future research.